Semi-Supervised Learning Based on Semiparametric Regularization

Authors

  • Zhen Guo
  • Zhongfei Zhang
  • Eric P. Xing
  • Christos Faloutsos
Abstract

Semi-supervised learning plays an important role in the recent literature on machine learning and data mining, and the semi-supervised learning techniques developed so far have led to many data mining applications in recent years. This paper addresses the semi-supervised learning problem by developing an approach based on semiparametric regularization, which attempts to discover the marginal distribution of the data and learns a parametric function by exploiting the geometric distribution of the data. The learned parametric function can then be incorporated into supervised learning on the available labeled data as prior knowledge. Specifically, our contributions are: (1) We present a semi-supervised learning approach that incorporates the unlabeled data into supervised learning through a parametric function learned from the whole data, both labeled and unlabeled. The parametric function reflects the geometric structure of the marginal distribution of the data. Furthermore, the proposed approach extends naturally to out-of-sample data and is therefore an inductive learning method. (2) The approach allows a family of algorithms to be developed based on various choices of the original RKHS and the loss function. (3) We provide experimental comparisons showing that the proposed approach achieves state-of-the-art performance on a variety of classification tasks. In particular, we demonstrate that the approach can be used successfully in both transductive and semi-supervised settings.
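The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the idea could be realized, not the authors' method. It assumes that the parametric function (called `psi` here) is the leading kernel PCA component computed on all labeled and unlabeled points, that the loss is the squared loss, and that the RKHS is induced by an RBF kernel. The class name `SemiparametricRLS` and the parameters `gamma` and `lam` are hypothetical. The model has the form f(x) = sum_i alpha_i K(x_i, x) + beta * psi(x), where only the kernel part is penalized.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


class SemiparametricRLS:
    """Hypothetical sketch: regularized least squares plus one unpenalized
    term psi learned by kernel PCA on labeled and unlabeled data."""

    def __init__(self, gamma=1.0, lam=1e-2):
        self.gamma = gamma
        self.lam = lam

    def _psi(self, X):
        # Project X onto the leading kernel PCA component (centered kernel).
        Kt = rbf_kernel(X, self.X_all, self.gamma)
        Kc = (Kt - self.col_mean[None, :]
              - Kt.mean(axis=1, keepdims=True) + self.all_mean)
        return Kc @ self.coef_psi

    def fit(self, X_lab, y_lab, X_unl):
        # Step 1: learn psi from the whole data set (assumed: kernel PCA).
        self.X_all = np.vstack([X_lab, X_unl])
        K = rbf_kernel(self.X_all, self.X_all, self.gamma)
        self.col_mean = K.mean(axis=0)
        self.all_mean = K.mean()
        Kc = (K - self.col_mean[None, :]
              - K.mean(axis=1, keepdims=True) + self.all_mean)
        w, V = np.linalg.eigh(Kc)          # eigenvalues in ascending order
        self.coef_psi = V[:, -1] / np.sqrt(w[-1])

        # Step 2: supervised fit on labeled data with psi as prior knowledge.
        # Minimizes ||y - K alpha - beta psi||^2 + lam * alpha^T K alpha.
        self.X_lab = X_lab
        psi = self._psi(X_lab)
        A = rbf_kernel(X_lab, X_lab, self.gamma) + self.lam * np.eye(len(X_lab))
        My = np.linalg.solve(A, y_lab)
        Mpsi = np.linalg.solve(A, psi)
        self.beta = (psi @ My) / (psi @ Mpsi)
        self.alpha = My - self.beta * Mpsi
        return self

    def decision_function(self, X):
        # Works for unseen points, so the sketch is inductive.
        K = rbf_kernel(X, self.X_lab, self.gamma)
        return K @ self.alpha + self.beta * self._psi(X)
```

Because `decision_function` only needs kernel evaluations against the stored training points, the sketch can score out-of-sample data, which matches the inductive property claimed in the abstract. Swapping the squared loss for another loss, or the RBF kernel for another kernel, would give other members of the family of algorithms the abstract mentions.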

Similar resources

Semi-supervised Regression with Order Preferences

Following a discussion on the general form of regularization for semi-supervised learning, we propose a semi-supervised regression algorithm. It is based on the assumption that we have certain order preferences on unlabeled data (e.g., point x1 has a larger target value than x2). Semi-supervised learning consists of enforcing the order preferences as regularization in a risk minimization framew...

Interactive Segmentation in Multimodal Medical Imagery using a Bayesian Transductive Learning Approach

Labeled training data in the medical domain is rare and expensive to obtain. The lack of labeled multimodal medical image data is a major obstacle for devising learning-based interactive segmentation tools. Transductive learning (TL) or semi-supervised learning (SSL) offers a workaround by leveraging unlabeled and labeled data to infer labels for the test set given a small portion of label info...

SERBoost: Semi-supervised Boosting with Expectation Regularization

The application of semi-supervised learning algorithms to large scale vision problems suffers from the bad scaling behavior of most methods. Based on the Expectation Regularization principle, we propose a novel semi-supervised boosting method, called SERBoost that can be applied to large scale vision problems. The complexity is mainly dominated by the base learners. The algorithm provides a mar...

Statistical Analysis of Semi-Supervised Regression

Semi-supervised methods use unlabeled data in addition to labeled data to construct predictors. While existing semi-supervised methods have shown some promising empirical performance, their development has been based largely on heuristics. In this paper we study semi-supervised learning from the viewpoint of minimax theory. Our first result shows that some common methods based on regulari...

Transductive Classification via Dual Regularization

Semi-supervised learning has witnessed increasing interest in the past decade. One common assumption behind semi-supervised learning is that the data labels should be sufficiently smooth with respect to the intrinsic data manifold. Recent research has shown that the features also lie on a manifold. Moreover, there is a duality between data points and features, that is, data points can be classi...

Publication date: 2008